Journal of Chemical Theory and Computation
● American Chemical Society (ACS)
All preprints, ranked by how well they match Journal of Chemical Theory and Computation's content profile, based on 126 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit. Older preprints may already have been published elsewhere.
Kuehrova, P.; Mlynsky, V.; Otyepka, M.; Sponer, J.; Banas, P.
Biologically functional RNAs operate near marginal stability, and their rugged free-energy landscapes and profound structural dynamics - typically not captured by structural biology experiments - play decisive roles. Atomistic molecular dynamics (MD) simulations provide a unique means to characterize these features. However, the applicability of atomistic MD is currently limited by accessible simulation timescales and, most importantly, by force-field (FF) accuracy. Folding free energies (ΔG°fold) of small RNA motifs represent well-defined targets for quantitative benchmarking of RNA FFs. In practice, however, obtaining thermodynamic estimates that are sufficiently robust for direct comparison with experimental data remains highly challenging, even for small RNA systems, and many published studies rely on sampling that is not fully converged. Here, we systematically assess the performance of widely used advanced enhanced-sampling techniques using the 8-mer r(gcGAGAgc) tetraloop as a representative benchmark system. We test temperature replica exchange (T-REMD), two solute-tempering variants of replica exchange (REST2 and REHT), as well as well-tempered metadynamics and on-the-fly probability enhanced sampling combined with solute tempering (ST-MetaD and ST-OPES). Among the tested approaches, T-REMD proves to be the most robust, yielding reproducible folding equilibria and consistent estimates of ΔG°fold after approximately 20 µs of simulation time, independent of the initial folded or unfolded conformational ensemble. Our results provide practical guidelines for selecting sampling protocols suitable for quantitative RNA benchmarks and lay the foundation for systematic validation and future refinement of RNA FFs.
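As an illustration of the quantity benchmarked above, the following minimal sketch converts a folded-state population into a folding free energy under a simple two-state assumption, and checks whether estimates from folded and unfolded starting ensembles agree. The function names, the 298.15 K default, and the 0.5 kcal/mol tolerance are hypothetical choices for this example and are not taken from the preprint.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
R_KCAL_PER_MOL_K = 0.0019872041


def folding_free_energy(p_folded, temperature_k=298.15):
    """Two-state folding free energy: -RT * ln(p_folded / p_unfolded)."""
    if not 0.0 < p_folded < 1.0:
        raise ValueError("p_folded must lie strictly between 0 and 1")
    ratio = p_folded / (1.0 - p_folded)
    return -R_KCAL_PER_MOL_K * temperature_k * math.log(ratio)


def estimates_agree(dg_from_folded, dg_from_unfolded, tolerance=0.5):
    """Crude convergence check: do runs started from folded and unfolded
    ensembles give the same free energy within a tolerance (kcal/mol)?"""
    return abs(dg_from_folded - dg_from_unfolded) <= tolerance
```

A folded population of 0.5 gives a free energy of zero, and populations above 0.5 give negative (favorable) values; in practice the population would come from the reference-temperature replica after discarding equilibration.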
Pooja; Bandyopadhyay, P.
Mutations in calcium-binding proteins (CBPs) can significantly influence Ca2+ binding affinity (BA), resulting in substantial impairment of the signaling process and leading to several lethal diseases. Understanding the changes in binding affinity can help in understanding the signaling process and in designing inhibitors for therapeutic use. However, accurate prediction of BA for a large number of mutations has been elusive. In this work, for an important calcium-binding protein, cardiac Troponin-C, we have developed an integrative modeling approach that combines molecular dynamics (MD)-based binding free energy calculations, prediction of plausible mutants using evolutionary information, and an interpretable machine learning model to predict Ca2+ BA for a large number of mutations (seventy-six in all). For the binding free energy calculation, we have used a charge-scaling based MD simulation that considers the polarization in the system, which is critical for divalent ion binding with proteins. The well-known molecular mechanics Poisson-Boltzmann surface area (MM-PBSA) method was used for the binding free energy calculations. The calculated results for twenty-four disease mutants, which are associated with different cardiomyopathies and have experimental binding affinity, are in close agreement with the experimental results. To study other plausible mutations, we have probed the evolutionary landscape of cardiac Troponin-C and used the EVmutation method of Hopf et al. (Nature Biotechnology 2017, 35, 128-135) to generate sixty-one additional mutants. Finally, a support vector regression model was developed for both observed and plausible mutations. Our machine learning model used simple structure and sequence-based descriptors along with MD-based descriptors and gave a mean squared error (MSE) of only 0.16 kcal/mol.
Assessment of the contribution of each descriptor shows that the number of water molecules within the Ca2+ binding site, the type of amino acid substitution (e.g. polar to hydrophobic reduces the binding affinity), and the distance of the mutation from Ca2+ are the most important factors in determining the binding affinity. This integrative modeling can be used for other CBPs and can lay the path for modeling the complex and astronomically large mutational landscape of calcium-binding proteins.
Singh, B.; Martinez-Noa, Y.; Perez, A.
Linear peptides play essential roles in biology and drug discovery, frequently mediating protein-protein interactions through short, flexible motifs. However, their structural plasticity--ranging from disordered to context-dependent folding--makes them challenging targets for molecular simulations. In this work, we benchmark the performance of twelve popular and emerging fixed-charge force fields across a curated set of twelve peptides spanning structured miniproteins, context-sensitive epitopes, and disordered sequences. Each peptide was simulated from both folded (200 ns) and extended (10 µs) states to assess stability, folding behavior, and force field biases. Our analysis reveals consistent trends: some force fields exhibit strong structural bias, others allow reversible fluctuations, and no single model performs optimally across all systems. The study highlights limitations in current force fields' ability to balance disorder and secondary structure, particularly when modeling conformational selection. These results offer practical guidance for peptide modeling and establish a benchmark framework for future force field development and validation in peptide-relevant regimes.
Hsueh, S. C. C.; Aina, A.; Plotkin, S. S.
Cyclic peptides naturally occur as antibiotics, fungicides, and immunosuppressants, and have been adapted for use as potential therapeutics. Scaffolded cyclic peptide antigens have many protein characteristics such as reduced toxicity, increased stability over linear peptides, and conformational selectivity, but with fewer amino acids than whole proteins. The profile of shapes presented by a cyclic peptide modulates its therapeutic efficacy, and is represented by the ensemble of its sampled conformations. Although some algorithms excel in creating a diverse ensemble of cyclic peptide conformations, they seldom address the entropic contribution of flexible conformations, and they often have significant practical difficulty producing an ensemble with converged and reliable thermodynamic properties. In this study, an accelerated molecular dynamics (MD) method, reservoir replica exchange MD (R-REMD or Res-REMD), was implemented in GROMACS-4.6.7, and benchmarked on three small cyclic peptide model systems: a cyclized segment of Aβ (cyclo-(CGHHQKLVG)), a cyclized furin cleavage site of SARS-CoV-2 spike (cyclo-(CGPRRARSG)), and oxytocin (disulfide bonded CY-IQNCPLG). We also benchmarked Res-REMD on alanine dipeptide and Trpzip2 to demonstrate its validity and efficiency over REMD. Compared to REMD, Res-REMD significantly accelerated the ensemble generation of cyclo-(CGHHQKLVG), but not cyclo-(CGPRRARSG) or oxytocin. This difference is due to the longer auto-correlation time of torsional angles in cyclo-(CGHHQKLVG) vs. the latter two cyclic peptide systems; the randomly seeded reservoir in Res-REMD thus accelerates sampling and convergence. The auto-correlation time of the torsional angles can thus be used to determine whether Res-REMD is preferable to REMD for cyclic peptides. We provide a GitHub page with modified GROMACS source code for running Res-REMD at https://github.com/PlotkinLab/Reservoir-REMD.
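The auto-correlation criterion mentioned above can be sketched as follows. This is a generic integrated autocorrelation time estimate for a scalar time series, not the authors' analysis code; applying it to the cosine (or sine) of a torsion rather than the raw angle, and truncating the sum at the first non-positive value, are assumptions made here to handle periodicity and noise.

```python
import math


def autocorrelation(series, lag):
    """Normalized autocorrelation of a scalar series at a given lag."""
    n = len(series)
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series) / n
    if var == 0.0:
        raise ValueError("series has zero variance")
    cov = sum((series[i] - mean) * (series[i + lag] - mean)
              for i in range(n - lag)) / (n - lag)
    return cov / var


def integrated_autocorrelation_time(series, max_lag=None):
    """tau = 1 + 2 * sum of C(lag), truncated at the first non-positive C."""
    if max_lag is None:
        max_lag = len(series) // 2
    tau = 1.0
    for lag in range(1, max_lag + 1):
        c = autocorrelation(series, lag)
        if c <= 0.0:
            break
        tau += 2.0 * c
    return tau


def torsion_signal(angles_deg):
    """Map torsion angles in degrees to cos(angle) to avoid wrap-around jumps."""
    return [math.cos(math.radians(a)) for a in angles_deg]
```

A torsion that lingers in one rotamer for many frames yields a large tau, while one that flips every frame yields tau close to 1; the result is in units of the frame spacing.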
Cannariato, M.; Scaramozzino, D.; Lee, B. H.; Deriu, M. A.; Orellana, L.
The flexibility of DNA and RNA is known to play a central role in numerous biological processes, including chromatin organization and gene regulation. While a wide range of computational approaches have been developed to investigate the conformational dynamics and flexibility of proteins, analogous methods for nucleic acids remain comparatively underexplored. Elastic Network Models (ENMs) - coarse-grained mechanical representations in which macromolecules are modeled as networks of nodes connected by elastic springs - have been successfully applied to proteins, often capturing experimentally observed conformational changes through a small number of harmonic normal modes. Building on a previously validated three-bead ENM for RNA, here we introduce edENM, an essential dynamics-refined ENM for DNA, RNA, and protein-nucleic acid complexes, parametrized using a diverse set of molecular dynamics simulations. The vibrational modes of the new edENM show good agreement with NMR data and experimental ensembles, while avoiding the unrealistic and localized deformability of previous ENM parametrizations. Additionally, we integrated this new edENM into eBDIMS, a Brownian Dynamics-based framework that enables the simulation of large-scale and anharmonic conformational transitions in protein assemblies. In this way, we are now able to explore functional motions in large protein-nucleic acid complexes such as chromatin subunits and ribosomes.
Wiebeler, C.; Falkner, S.; Schwierz, N.
Accurate ion force fields are essential for molecular dynamics simulations of biomolecular systems, particularly in combination with modern water models such as OPC. While OPC water improves the description of bulk water and biomolecules, the transferability of existing ion force fields to this model remains an open question. Here, we systematically assess the transferability of monovalent and divalent ion force field parameters (Li+, Na+, K+, Cs+, Mg2+, Ca2+, Sr2+, Ba2+, Cl- and Br-) to OPC water by comparing single-ion and ion-pairing properties with experimental data. Our analysis reveals that no single literature parameter set provides accurate results for all ions when directly transferred to OPC water. We hence introduce the MS/G-LB(OPC) force field, which combines Mamatkulov-Schwierz-Grotz cation parameters with Loche-Bonthuis anion parameters. MS/G-LB(OPC) reproduces hydration free energies, first-shell structural properties and activity derivatives at low salt concentrations. Our results demonstrate that transferring ion parameters to OPC can lead to significant and ion-specific deviations from experimental data, making careful validation essential. At the same time, the systematic transfer and combination of ion parameters from existing force fields can provide a practical and computationally efficient alternative to full reparameterization. MS/G-LB(OPC) is available at https://git.rz.uni-augsburg.de/cbio-gitpub/opc-ion-force-fields.
Yang, D.; Gronenborn, A.; Chong, L.
We developed force field parameters for fluorinated aromatic amino acids enabling molecular dynamics (MD) simulations of fluorinated proteins. These parameters are tailored to the AMBER ff15ipq protein force field and enable the modeling of 4, 5, 6, and 7F-tryptophan, 3F- and 3,5F-tyrosine, and 4F- or 4-CF3-phenylalanine. The parameters include 181 unique atomic charges derived using the Implicitly Polarized Charge (IPolQ) scheme in the presence of SPC/Eb explicit water molecules and 9 unique bond, angle, or torsion terms. Our simulations of benchmark peptides and proteins maintain expected conformational propensities on the µs-timescale. In addition, we have developed an open-source Python program to calculate fluorine relaxation rates from MD simulations. The extracted relaxation rates from protein simulations are in good agreement with experimental values determined by 19F NMR. Collectively, our results illustrate the power and robustness of the IPolQ lineage of force fields for modeling structure and dynamics of fluorine-containing proteins at the atomic level.
Sarkar, D. K.; Surpeta, B.; Brezovsky, J.
Given that most proteins have buried active sites, protein tunnels or channels play a crucial role in mediating the transport of small molecules to the buried cavity for enzymatic catalysis. Tunnels can critically modulate the biological process of protein-ligand recognition. Various molecular dynamics methods have been developed for exploring and exploiting the protein-ligand conformational space to extract high-resolution details of the binding processes, one of the most recent being energetically unbiased high-throughput adaptive sampling simulations. The current study systematically contrasts the role of integrating prior knowledge while generating useful initial protein-ligand configurations, called seeds, for these simulations. Using a non-trivial system of a haloalkane dehalogenase mutant with multiple transport tunnels leading to a deeply buried active site, these simulations were employed to derive kinetic models describing the process of association and dissociation of the substrate molecule. The more knowledge-based seed generation enabled high-throughput simulations that could more consistently capture the entire transport process, effectively explore the complex network of transport tunnels, and predict equilibrium dissociation constants, koff/kon, on the same order of magnitude as experimental measurements. Overall, the infusion of more knowledge into the initial seeds of adaptive sampling simulations could render analyses of transport mechanisms in enzymes more consistent even for very complex biomolecular systems, thereby promoting the rational design of enzymes with buried active sites and drug development efforts.
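The relation between rate constants and the equilibrium dissociation constant used above can be written out as a short sketch. The function names, the 1 M standard concentration, and the kcal/mol units are assumptions of this example; the rate constants themselves would come from a kinetic model such as the ones described in the abstract.

```python
import math

# Gas constant in kcal/(mol*K); units are an assumption of this sketch.
R_KCAL_PER_MOL_K = 0.0019872041


def dissociation_constant(k_off, k_on):
    """Kd = koff / kon, with k_off in 1/s and k_on in 1/(M*s); result in M."""
    if k_off <= 0.0 or k_on <= 0.0:
        raise ValueError("rate constants must be positive")
    return k_off / k_on


def standard_binding_free_energy(kd_molar, temperature_k=298.15,
                                 standard_conc_molar=1.0):
    """Standard binding free energy: RT * ln(Kd / c_standard)."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(
        kd_molar / standard_conc_molar)
```

For instance, `dissociation_constant(1.0, 1e6)` corresponds to a micromolar binder, and agreement "on the same order of magnitude" in Kd translates to a free-energy difference of at most about RT ln(10), roughly 1.4 kcal/mol at room temperature.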
Blazhynska, M.; Ansari, N.; Lagardere, L.; Piquemal, J.-P.
Water molecules play a critical role in mediating protein-ligand interactions by forming bridging hydrogen bonds and contributing to ligand solvation. However, their intricate behavior, such as frequent exchange with bulk solvent or persistent stabilization in the binding site, makes accurate binding free-energy estimation via molecular dynamics-based approaches challenging. In particular, inadequate sampling of water reorganization might not only bias computed affinities but also obscure key interactions, so that the binding site is not adequately rehydrated during the calculations. To address this, we employ the polarizable AMOEBA force field together with Lambda-ABF-OPES, an integrated enhanced-sampling framework that combines λ-dynamics, multiple-walker adaptive biasing force, and the exploratory version of the on-the-fly probability enhanced sampling technique. This approach enables efficient rehydration of the binding site and allows robust sampling of water exchange and reorganization events without explicitly including any water-related collective variable. Applied to five water-containing protein-ligand complexes with diverse ligand types and binding-site environments, the approach yields binding affinities efficiently and in good agreement with experimental data, demonstrating that Lambda-ABF-OPES captures dynamic water networks and provides robust and reproducible absolute binding free-energy estimation towards chemical accuracy.
Mandal, N.; Stevens, J. A.; Poma, A. B.; Surpeta, B.; Sequeiros-Borja, C.; Thirunavukarasu, A. S.; Marrink, S.-J.; Brezovsky, J.
Enzymes are pivotal to numerous biological processes, often featuring buried active sites linked to the surrounding solvent through intricate and dynamic tunnels. These tunnels are vital for facilitating substrate access, enabling product release, and regulating solvent exchange, which collectively influence enzymatic function and efficiency. Consequently, knowledge of tunnels is key for a holistic understanding of the effect of mutations as well as predicting drug residence times. Unfortunately, most transport tunnels are transient, i.e., equipped with molecular gates, rendering their opening a rare event that is notoriously hard to study with conventional molecular dynamics simulations. To overcome the sampling limitation of such simulations, this study investigated the efficacy of three different coarse-grained (CG) molecular dynamics simulation methods for inferring enzyme tunnel structure and dynamics. Here, we covered the Martini and SIRAH models with different restraint protocols providing stability to CG proteins while to some extent biasing the sampling towards a reference structure. By contrasting CG results with all-atom simulations, we benchmarked the ability of CG methods to replicate ensemble characteristics of complex tunnel networks in haloalkane dehalogenase LinB and two of its mutants with engineered tunnel networks. The assessed tunnel parameters are essential for prioritizing functionally relevant tunnels and delineating the effect of mutations on transport tunnels. Our findings reveal that while CG methods significantly enhance the efficiency of tunnel analyses, some of them, like Martini with elastic network restraints, were limited in recapitulating all-atom tunnel dynamics due to the structural bias applied. In contrast, the Martini Gō model even captured the intricate details of mutations perturbing tunnel dynamics. All studied CG methods performed well in capturing the geometry of tunnel ensembles in line with all-atom simulations.
Additionally, the wider applicability of CG methods was verified by analyzing tunnel networks of nine enzymes from different combinations of structural and functional classes, demonstrating their potential to uncover new tunnel phenomena and validating their utility in broader biological and functional contexts. This comprehensive evaluation underscores the strengths and constraints of CG simulations in capturing enzyme tunnels and benefiting from their computational speed for studying huge datasets of enzymes. These insights are valuable for enzyme engineering, drug design, and understanding enzyme function while benefitting from the efficiency of coarse-grained models.
Barron, M. P.; Wijeratne, H. R. S.; Runnebohm, A. M.; Caric, K. M.; Mosley, A. L.; Vilseck, J. Z.
Charge-changing perturbations are notoriously difficult to investigate with alchemical free energy calculations. The routine use of periodic boundary conditions and electrostatic approximations, such as particle-mesh Ewald (PME), may produce finite-size effect errors that become non-negligible as a perturbation changes a simulation cell's net charge away from zero. Two prevalent strategies exist to correct for these errors: the analytic correction (AC) and co-alchemical ion (CI) methods. Both correction schemes have been found to produce comparable relative free energy results for small molecule perturbations, but these methods have not been compared using λ-dynamics (λD) free energy calculations or for protein side chain mutations. Recently, we investigated relative folding and binding free energies (ΔΔGs) of a series of EXOSC3 variants involved in a rare neurodegenerative disorder, including D132A, G135R, and G191D charge-change perturbations, with a simplified AC scheme in λD. In this study, these perturbations are reevaluated with the CI scheme for comparison with AC to identify the best correction strategy for λD. The collected AC- and CI-corrected ΔΔGs show excellent agreement with a mean unsigned error of 0.4 kcal/mol. However, reduced sampling proficiency and increased difficulties of evaluating multisite perturbations with the CI method suggest that a simplified AC approach may be more generalizable for future λD calculations. Previously, the use of the CI approach with λD has been limited due to a lack of infrastructure available to users to simplify its more involved setup procedure. This study introduces an automated workflow for implementation of the CI approach with λD, laying the foundation for future comparisons between charge-change correction schemes.
These studies facilitated analysis of the λD trajectories to identify structural changes within EXOSC3 and the RNA exosome complex that clearly rationalize the calculated ΔΔGs for the D132A, G135R, G191C, and G191D EXOSC3 variants, providing insight into potential disease-causing mechanisms of EXOSC3 modifications.
Mlynsky, V.; Kuehrova, P.; Bussi, G.; Otyepka, M.; Sponer, J.; Banas, P.
Understanding RNA structural dynamics is essential for elucidating its biological functions, and molecular dynamics (MD) simulations provide an important atomistic complement to experimental approaches. However, the predictive power of MD is fundamentally limited by the accuracy of the underlying empirical Force Fields (FFs), particularly in capturing the delicate balance of non-bonded interactions. Here, we present a systematic reparameterization strategy that replaces the external gHBfix19 hydrogen-bond (H-bond) correction potential with an equivalent set of NBfix Lennard-Jones modifications within a state-of-the-art RNA FF. Using a quantitatively converged temperature replica-exchange MD ensemble of the GAGA tetraloop, we employed a reweighting-based optimization protocol to derive NBfix parameters that reproduce the thermodynamic effects of the original gHBfix19 terms. Sequential optimization of individual gHBfix19 components proved essential to ensure stable and transferable parameter refinement. The resulting fully reformulated NBfix-based variant, termed OL3CP-NBfix19, was validated on a representative set of RNA motifs, including tetranucleotides, A-form duplexes, and tetraloops. Across all tested systems, its performance is comparable to that of the reference gHBfix19 FF. By embedding the H-bond corrections directly into the standard non-bonded framework, the NBfix formulation eliminates external biasing potentials, simplifies practical deployment, and reduces computational overhead. Beyond this specific reparameterization, our results demonstrate a practical workflow for translating targeted H-bond corrections into native FF terms for efficient biomolecular simulations.
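The reweighting idea behind such an optimization can be illustrated with a minimal sketch: frames from an existing ensemble are assigned Boltzmann weights based on the energy change a trial parameter set would introduce, and observables are re-averaged without new simulations. This is a generic exponential reweighting example, not the authors' protocol; the function name, the per-frame energy differences, and the use of the Kish effective sample size as a reliability check are assumptions.

```python
import math


def reweight(observable, delta_u, kT):
    """Reweight per-frame observables to a perturbed potential.

    observable: per-frame values (e.g. 1.0 if folded, 0.0 otherwise)
    delta_u:    per-frame U_new - U_old, in the same units as kT
    Returns (reweighted average, Kish effective sample size).
    """
    if len(observable) != len(delta_u) or not observable:
        raise ValueError("inputs must be non-empty and of equal length")
    shift = min(delta_u)  # subtract the minimum for numerical stability
    weights = [math.exp(-(du - shift) / kT) for du in delta_u]
    total = sum(weights)
    weights = [w / total for w in weights]
    average = sum(w * o for w, o in zip(weights, observable))
    effective_n = 1.0 / sum(w * w for w in weights)
    return average, effective_n
```

When the effective sample size drops to a small fraction of the number of frames, the trial parameters lie too far from the sampled ensemble for the reweighted average to be trusted, which is one reason small, sequential parameter changes tend to be more stable than a single large step.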
Jana, K.; Kepp, K. P.
Predicting protein structure from sequence is a central challenge of biochemistry, yet different force fields feature distinct structural biases that are hard to quantify, preventing clear assessment of results. Since structural transitions occur on timescales of milliseconds to seconds, sampling is out of reach in almost all routine studies; we inherently rely on locally sampled structures, and benchmarks have emphasized the ability to reproduce these local structures. Here we approach the force field bias problem in a different way, via alternatives, by revisiting the old question: How unique is the sequence-structure relationship when studied computationally? To circumvent the sampling problem, the system bias (specific structure choices affect apparent force field structural preference), and the complexity of tertiary structure, we studied ten small α- and β-proteins (20-35 amino acids) with one helix or sheet. For each of the ten sequences, we then designed alternative β- or α-structures and subjected all 20 proteins to molecular dynamics simulations. We apply this "alternative structure" benchmark to five of the best modern force fields: Amber ff99SB-ILDN, Amber ff99SB*-ILDN, CHARMM22*, CHARMM36, and GROMOS54A8. Surprisingly, we find that all sequences with reported β-structures also feature stable native-like α-structures with all five force fields. In contrast, only the alternative β-1T5Q and to some extent β-1CQ0 and β-1V1D resembled native β-proteins. With full phase space sampling being impossible in almost all cases, our benchmark by alternatives, which samples another local part of phase space in direct comparison, is a useful complement to millisecond benchmarks when these become more common.
Rufa, D. A.; Bruce Macdonald, H. E.; Fass, J.; Wieder, M.; Grinaway, P. B.; Roitberg, A. E.; Isayev, O.; Chodera, J. D.
Alchemical free energy methods with molecular mechanics (MM) force fields are now widely used in the prioritization of small molecules for synthesis in structure-enabled drug discovery projects because of their ability to deliver 1-2 kcal mol-1 accuracy in well-behaved protein-ligand systems. Surpassing this accuracy limit would significantly reduce the number of compounds that must be synthesized to achieve desired potencies and selectivities in drug design campaigns. However, MM force fields pose a challenge to achieving higher accuracy due to their inability to capture the intricate atomic interactions of the physical systems they model. A major limitation is the accuracy with which ligand intramolecular energetics--especially torsions--can be modeled, as poor modeling of torsional profiles and coupling with other valence degrees of freedom can have a significant impact on binding free energies. Here, we demonstrate how a new generation of hybrid machine learning / molecular mechanics (ML/MM) potentials can deliver significant accuracy improvements in modeling protein-ligand binding affinities. Using a nonequilibrium perturbation approach, we can correct a standard, GPU-accelerated MM alchemical free energy calculation in a simple post-processing step to efficiently recover ML/MM free energies and deliver a significant accuracy improvement with small additional computational effort. To demonstrate the utility of ML/MM free energy calculations, we apply this approach to a benchmark system for predicting kinase:inhibitor binding affinities--a congeneric ligand series for non-receptor tyrosine kinase TYK2 (Tyk2)--wherein state-of-the-art MM free energy calculations (with OPLS2.1) achieve inaccuracies of 0.93±0.12 kcal mol-1 in predicting absolute binding free energies.
Applying an ML/MM hybrid potential based on the ANI2x ML model and AMBER14SB/TIP3P with the OpenFF 1.0.0 ("Parsley") small molecule force field as an MM model, we show that it is possible to significantly reduce the error in absolute binding free energies from 0.97 [95% CI: 0.68, 1.21] kcal mol-1 (MM) to 0.47 [95% CI: 0.31, 0.63] kcal mol-1 (ML/MM).
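A post-processing correction of this kind can be sketched with the Jarzynski exponential average, which turns nonequilibrium work values for switching from one potential to another into a free-energy difference. Whether the preprint uses this particular estimator is not stated in the abstract, so the estimator choice, the function name, and the input format are assumptions of this example.

```python
import math


def jarzynski_free_energy(work_values, kT):
    """Free-energy difference from nonequilibrium work values:
    dF = -kT * ln( mean( exp(-W / kT) ) ), evaluated stably."""
    if not work_values:
        raise ValueError("need at least one work value")
    shift = min(work_values)  # factor out the smallest work for stability
    mean_exp = sum(math.exp(-(w - shift) / kT)
                   for w in work_values) / len(work_values)
    return shift - kT * math.log(mean_exp)
```

The estimate never exceeds the mean work, and the gap between the two reflects dissipation; with few or broadly distributed work values the exponential average is dominated by rare low-work samples, so bidirectional estimators are often preferred in practice.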
Grazzi, A.; Brown, C. M.; Sironi, M.; Marrink, S.-J.; Pieraccini, S.
Accessing deeply buried binding sites remains a major challenge in structure-based drug discovery, where accurate description of both protein dynamics and ligand binding pathways is required. Funnel metadynamics enables simulation of complete binding processes but is computationally demanding at the all-atom resolution. By adopting the Martini 3 force field, coarse-grained funnel metadynamics (CG-FMD) substantially reduces computational requirements while retaining enhanced sampling capabilities. In this work, we assess the capability of CG-FMD to model ligand recognition at the deeply buried colchicinoid site of the tubulin αβ-heterodimer, a multisite protein of strategic importance. We investigated the binding of colchicine, podophyllotoxin and combretastatin-A4, recovering free energy profiles with improved statistical convergence compared to all-atom funnel metadynamics (AA-FMD) and comparable to experimental references. In particular, CG-FMD binding free energies present mean absolute errors between 3 and 10 kJ mol-1. These results position CG-FMD as an efficient, physics-based framework for probing ligand binding to challenging sites.
Lee, S.; Wang, D.; Seeliger, M.; Tiwary, P.
Understanding drug residence times in target proteins is key to improving drug efficacy and understanding target recognition in biochemistry. While drug residence time is just as important as binding affinity, atomic-level understanding of drug residence times through molecular dynamics (MD) simulations has been difficult, primarily due to the extremely long timescales involved. Recent advances in rare event sampling have allowed us to reach these timescales, yet predicting protein-ligand residence times remains a significant challenge. Here we present a semi-automated protocol to calculate ligand residence times across 12 orders of magnitude in timescale. In our proposed framework, we integrate a deep learning-based method, the state predictive information bottleneck (SPIB), to learn an approximate reaction coordinate (RC) and use it to guide the enhanced sampling method metadynamics. We demonstrate the performance of our algorithm by applying it to six different protein-ligand complexes with available benchmark residence times, including the dissociation of the widely studied anti-cancer drug Imatinib (Gleevec) from both wild-type Abl kinase and drug-resistant mutants. We show how our protocol can recover quantitatively accurate residence times, potentially opening avenues for deeper insights into drug development possibilities and ligand recognition mechanisms.
Yang, S.; Song, C.
Proteins are inherently dynamic molecules, and their conformational transitions among various states are essential for numerous biological processes, which are often modulated by their interactions with surrounding environments. Although molecular dynamics (MD) simulations are widely used to investigate these transitions, all-atom (AA) methods are often limited by short timescales and high computational costs, and coarse-grained (CG) implicit-solvent Gō-like models are usually incapable of studying the interactions between proteins and their environments. Here, we present an approach called Multiple-basin Gō-Martini, which combines the recent Gō-Martini model with an exponential mixing scheme to facilitate the simulation of spontaneous protein conformational transitions in explicit environments. We demonstrate the versatility of our method through five diverse case studies: GlnBP, Arc, Hinge, SemiSWEET, and TRAAK, representing ligand-binding proteins, fold-switching proteins, de novo designed proteins, transporters, and mechanosensitive ion channels, respectively. The Multiple-basin Gō-Martini offers a new computational tool for investigating protein conformational transitions, identifying key intermediate states, and elucidating essential interactions between proteins and their environments, particularly protein-membrane interactions. In addition, this approach can efficiently generate thermodynamically meaningful datasets of protein conformational space, which may enhance deep learning-based models for predicting protein conformation distributions.
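One common form of exponential mixing for a two-basin Gō-like potential can be sketched as below: the mixed energy is a smooth minimum of the two single-basin energies, controlled by a mixing parameter and an offset that shifts the relative stability of the second basin. The abstract does not give the functional form, so this formula, the function name, and the parameter names are assumptions for illustration only.

```python
import math


def mixed_potential(v1, v2, beta_mix, offset=0.0):
    """Exponential mixing of two basin energies:
    exp(-beta_mix * V) = exp(-beta_mix * v1) + exp(-beta_mix * (v2 + offset)).
    Evaluated with a log-sum-exp shift for numerical stability."""
    if beta_mix <= 0.0:
        raise ValueError("beta_mix must be positive")
    a = -beta_mix * v1
    b = -beta_mix * (v2 + offset)
    m = max(a, b)
    return -(m + math.log(math.exp(a - m) + math.exp(b - m))) / beta_mix
```

Far from the crossing region the mixed energy follows whichever basin is lower; near the crossing, a smaller `beta_mix` lowers and smooths the barrier between the two states, while `offset` tunes their relative populations.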
Yang, S.; Song, C.
Proteins are dynamic biomolecules that can transform between different conformational states when exerting physiological functions, which is difficult to simulate using all-atom methods. Coarse-grained Gō-like models are widely used to investigate large-scale conformational transitions, but they usually adopt implicit solvent models and therefore cannot explicitly capture the interaction between proteins and surrounding molecules, such as water and lipid molecules. Here, we present a new method, named Switching Gō-Martini, to simulate large-scale protein conformational transitions between different states, based on the switching Gō method and the coarse-grained Martini 3 force field. The method is straightforward and efficient, as demonstrated by the benchmarking applications for multiple protein systems, including glutamine binding protein (GlnBP), adenylate kinase (AdK), and β2-adrenergic receptor (β2AR). Moreover, by employing the Switching Gō-Martini method, we can not only unveil the conformational transition from the E2Pi-PL state to the E1 state of the Type 4 P-type ATPase (P4-ATPase) flippase ATP8A1-CDC50, but also provide insights into the intricate details of lipid transport.
Sabei, A.; Caldas Baia, T. G.; Saffar, R.; Martin, J.; Frezza, E.
We investigated the capability of internal normal modes to reproduce RNA dynamics and predict observed RNA conformational changes, and, notably, those induced by the formation of RNA-protein and RNA-ligand complexes. Here, we extended our iNMA approach developed for proteins to study RNA molecules using a simplified representation of RNA structure and its potential energy. Three datasets were also created to investigate different aspects. Despite all the approximations, our study shows that iNMA is a suitable method to take into account RNA flexibility and describe its conformational changes opening the route to its applicability in any integrative approach where these properties are crucial.
Schulze, M.; Khakhula, T.; Piasentin, N.; Aureli, S.; Rizzi, V.; Gervasio, F. L.
Selecting appropriate collective variables (CVs) is a crucial bottleneck in enhanced sampling molecular dynamics (MD) simulations. Although progress has been made with data-driven and intuition-based approaches, optimal CVs remain system-specific. Meanwhile, simple geometric descriptors are still widely used due to their transferability. A promising, yet under-explored, candidate for a more efficient CV is solvation. Indeed, despite its central role in ligand binding and folding, the complexity of solvent behavior has hindered its widespread use. Here, we introduce a data-driven and automatic strategy to construct robust solvation-based CVs. Our method identifies critical hydration sites by analyzing the radial distribution function of water around a ligand. Remarkably, using only these hydration CVs within on-the-fly probability enhanced sampling (OPES) simulations, we successfully converge the binding free energy landscapes for a series of host-guest systems. These landscapes show excellent agreement with those from more computationally expensive benchmark methods. We further demonstrate that the choice of where to bias water is key to efficient convergence, providing clear guidelines for implementation. This work not only underscores the central role of water in molecular recognition but also offers a powerful and generalizable framework for enhancing the sampling of complex biomolecular events.